Abstract

A random variable (r.v.) X is said to follow Benford's law if log(X) is uniform
mod 1. Many experimental data sets prove to follow an approximate version 
of it, and so do many mathematical series and continuous random variables. 
This phenomenon received some interest, and several explanations have been 
put forward. Most of them focus on specific data, depending on strong 
assumptions, often linked with the log function. 

Some authors hinted - implicitly - that the two most important charac- 
teristics of a random variable when it comes to Benford are regularity and 
scatter. 

In a first part, we prove two theorems, making up a formal version of this 
intuition: scattered and regular r.v.'s do approximately follow Benford's law. 
The proofs only need simple mathematical tools, making the analysis easy. 
Previous explanations thus become corollaries of a more general and simpler 
one. 

These results suggest that Benford's law does not depend on properties 
linked with the log function. We thus propose and test a general version of 
Benford's law. The success of these tests may be viewed as an a posteriori
validation of the analysis formulated in the first part.

1. Introduction



First noticed by Newcomb (1881), and again later by Benford (1938), the
so-called Benford's law states that a sequence of "random" numbers should
be such that their logarithms are uniform mod 1. As a consequence, the
first non-zero digit of a sequence of "random" numbers is d with probability
log(1 + 1/d), an unexpectedly non-uniform probability law. Here log stands for
the base 10 logarithm, but an easy generalisation follows: a random variable
(r.v.) conforms to base b Benford's law if its base b logarithm log_b(X) is
uniform mod 1. Lolbert (2008) recently proved that no r.v. follows base b
Benford's law for all b.
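As a quick numerical illustration of the first-digit law stated above, the following minimal sketch tabulates log(1 + 1/d) for d = 1, ..., 9. The function name is ours and purely illustrative.

```python
import math

def benford_first_digit_probs():
    """Return {d: log10(1 + 1/d)} for the leading digits d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probs()
for d, p in probs.items():
    print(d, round(p, 4))
```

The nine probabilities sum to 1 because the sum telescopes to log(10).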

Many experimental data sets roughly conform to Benford's law (most of them
no more than roughly). However, the vast majority of real data sets that have
been tested do not fit this law at all. For instance, Scott and Fasli (2001)
reported that only 12.6% of 230 real data sets passed the test for Benford's
law. In his seminal paper, Benford (1938) tested 20 data sets including lake
areas, lengths of rivers, populations, etc., half of which did not conform to
Benford's law.

The same is true of mathematical sequences or continuous r.v.'s. For
example, binomial arrays $\binom{n}{k}$, with $n \geq 1$, $k \in \{0, \dots, n\}$, tend toward Benford's
law (Diaconis, 1977), whereas simple sequences such as $(10^n)_{n \in \mathbb{N}}$ obviously
don't.

In spite of all this, Benford's law is actually used in so-called "digital
analysis" to detect anomalies in pricing (Sehity et al., 2005) or frauds, for
instance in accounting reports (Drake and Nigrini, 2000) or campaign finance
(Cho and Gaines, 2007). Faked data indeed usually depart from Benford's
law more than real ones (Hill, 1988). However, Hales et al. (2008) advise
caution, arguing that real data do not always fit the law.

Many explanations have been put forward to elucidate the appearance of
Benford's law on natural or mathematical data. Some authors focus on
particular random variables (Engel and Leuenberger, 2003), sequences
(Jolissaint, 2005), real data (Burke and Kincanon, 1991), or orbits of
dynamical systems (Berger et al., 2004). As a rule, other explanations
assume special properties of the data. Hill (1995b) and Pinkham (1961) show
that scale invariance implies Benford's law. Base invariance is another
sufficient condition (Hill, 1995a). Mixtures of uniform distributions
(Janvresse and Delarue, 2004) also conform to Benford's law, and so do the
limits of some random processes (Schürger, 2008; Pietronero et al., 2001).
Each of these explanations accounts for some appearances of data fitting
Benford's law, but lacks generality.

While looking for a truly general explanation, some authors noticed that
data sets are more likely to fit Benford's law if they are scattered enough.
More precisely, a sequence should "cover several orders of magnitude", as
Raimi (1976) expressed it. Of course, scatter alone is not a sufficient condition.
The sequence 0.9, 9, 90, 900... indeed covers several orders of magnitude, but
is far from conforming to Benford's law. The continuous random variables
that are known to fit Benford's law usually present some "regularity":
exponential densities, normal densities, or lognormal densities are of this kind.
Invariance assumptions (base-invariance or scale-invariance) lead to
"regular" densities, and so do the mixture assumptions of central-limit-like
theorems.

Some technical explanations may be viewed as a mathematical expression
of the idea that a random variable X is more likely to conform to Benford's
law if it is regular and scattered enough. Mardia and Jupp (2000, Example
4.1) linked Benford's law to Poincaré's theorem in circular statistics,
and Smith (2007) expressed it in terms of Fourier transforms and signal
processing. However, a non-expert reader would hardly notice the
smooth-and-scattered implications of these developments.

Though scatter has been explicitly mentioned and regularity allusively
evoked, the idea that scatter and regularity (in a sense that will be made clear
further) may actually be a sufficient explanation for Benford's phenomenon
related to continuous r.v.'s has never been formalized in a simple way, to
our knowledge, except in a recent article by Fewster (2009). In this paper,
Fewster hypothesizes that " any distribution [...] that is reasonably smooth 
and covers several orders of magnitude is almost guaranteed to obey Benford's 
law. " He then defines a smoothing procedure for a r.v. X based on π, the
probability density function (henceforth p.d.f.) of log (X) , and
illustrates with a few eloquent examples that under smoothness and scatter 
constraints, a r.v. cannot depart much from Benford's law. However, no 
theorem is given that would formalise this idea. 

In the first part of this paper, we prove a theorem from which it follows 
that scatter and regularity can be modelled in such a way that they, alone, 
imply rough compliance to Benford's law (again: real data usually do not 
perfectly fit Benford's law, irrespective of the sample size). 

It is not surprising that many data sets or samples of random variables are
scattered and regular; hence our explanation of Benford's phenomenon
corroborates a widespread intuition. The proof of this theorem is straightforward
and requires only basic mathematical tools. Furthermore, as we shall see, 
several of the existing explanations can be understood as corollaries of ours. 
Our explanation encompasses more specific ones, and is far simpler to un- 
derstand and to prove. 

Scatter and regularity do not presuppose any log-related properties (such
as log-normality, scale-invariance, or multiplicative properties).
For this reason, if we are right, Benford's law should also admit other
versions. We say that a r.v. X is u-Benford for a function u if u(X) is
uniform mod 1. The classical Benford's law is thus a special case of u-Benford's
law, with u = log. We test real data sets and mathematical sequences for
"u-Benfordness" with various u, and test a second theorem echoing the first one.
Most data conform to u-Benford's law for different u, which is an argument
in favour of our explanation.

2. Scatter and regularity: a key to Benford 

The basic idea at the root of theorem 1 (below) is twofold. 

First, we hypothesize that a continuous r.v. X with density f is almost
uniform mod 1 as soon as it is scattered and regular. More precisely, any f
that is non-decreasing on $]-\infty, a]$, and then non-increasing on $[a, +\infty[$ (for
regularity) and such that its maximum m = sup(f) is "small" (for scatter)
should correspond to a r.v. X approaching uniformity mod 1. Figure 1
illustrates this idea.

Figure 1: Illustration of the idea that a regular p.d.f. is almost bound to
give rise to uniformity mod 1. The stripes (restrictions of the density to
[n, n + 1]) of the p.d.f. of a r.v. X are stacked to form the p.d.f. of X
mod 1. The slopes partly compensate, so that the resulting p.d.f. is almost
uniform. If the initial p.d.f. is linear on every [n, n + 1], the compensation
is perfect.
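The perfect compensation mentioned in the caption can be checked on a toy case. The sketch below is our own illustration, not taken from the paper: it uses a "tent" density on [0, 2K] with integer K, which is continuous and linear on every [n, n + 1], and stacks its stripes to obtain the density of X mod 1.

```python
def tent_pdf(x, K):
    """Piecewise-linear 'tent' density on [0, 2K] with peak 1/K at x = K.

    K is a positive integer, so the density is linear on every [n, n + 1].
    """
    if 0 <= x <= K:
        return x / K**2
    if K < x <= 2 * K:
        return (2 * K - x) / K**2
    return 0.0

def wrapped_pdf(t, K):
    """Density of X mod 1 at t in [0, 1): sum of the stacked stripes."""
    return sum(tent_pdf(n + t, K) for n in range(2 * K))

K = 5
values = [wrapped_pdf(t / 10, K) for t in range(10)]
print(values)  # each entry equals 1 up to floating-point rounding
```

The rising slopes on [0, K] and the falling slopes on [K, 2K] cancel exactly, so the stacked density is constant.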

Second, note that if X is scattered and regular enough, so should be 
log (X). These two ideas are formalized and proved in theorem 1. 

Henceforth, for any real number x, $\lfloor x \rfloor$ will denote the greatest integer
not exceeding x, and $\{x\} = x - \lfloor x \rfloor$. Any positive x can be written as a
product $x = 10^{\lfloor \log x \rfloor} \cdot 10^{\{\log x\}}$, and Benford's law may be rephrased as the
uniformity of the random variable $\{\log(X)\}$.
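The decomposition of x into a power of ten and a factor in [1, 10[ can be sketched as follows. The helper name is hypothetical, and floating-point rounding may misplace inputs that sit extremely close to an exact power of ten.

```python
import math

def mantissa_and_digit(x):
    """Split x > 0 as 10**floor(log10 x) * 10**frac(log10 x).

    Returns (frac, leading_digit) where frac = {log10 x} lies in [0, 1).
    """
    y = math.log10(x)
    frac = y - math.floor(y)
    return frac, int(10 ** frac)

print(mantissa_and_digit(1234.0))
print(mantissa_and_digit(0.0567))
```

The leading digit depends on x only through {log x}, which is why uniformity of {log(X)} yields the logarithmic first-digit law.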

Theorem 1. Let X be a continuous positive random variable with p.d.f. f
such that Id.f : $x \mapsto x f(x)$ conforms to the two following conditions:
$\exists a > 0$ such that (1) $\max(\mathrm{Id}.f) = m = a f(a)$ and (2) Id.f is nondecreasing
on $]0, a]$, and nonincreasing on $[a, +\infty[$. Then, for any $z \in ]0, 1]$,

$$|P(\{\log X\} < z) - z| < 2 \ln(10)\, m.$$

In particular, $(X_n)$ being a sequence of continuous r.v.'s with p.d.f. $f_n$
satisfying these conditions and such that $m_n = \max(\mathrm{Id}.f_n) \to 0$, $\{\log(X_n)\}$
converges toward uniformity on [0, 1[ in law.






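PROOF. We first prove that for any continuous r.v. Y with density g such
that g is nondecreasing on $]-\infty, b]$, and then nonincreasing on $[b, +\infty[$, the
following holds:

$$\forall z \in ]0, 1], \quad |P(\{Y\} < z) - z| < 2M,$$

where $M = g(b) = \sup(g)$.

We may suppose without loss of generality that $b \in [0, 1[$. Let $z \in ]0, 1[$
(the case $z = 1$ is obvious). Put $I_{n,z} = [n, n + z[$. For any integer $n \leq -1$,

$$\frac{1}{z} \int_{I_{n,z}} g(t)\,dt \leq \int_{n}^{n+1} g(t)\,dt.$$

Thus

$$\frac{1}{z} \sum_{n \leq -1} \int_{I_{n,z}} g(t)\,dt \leq \int_{-\infty}^{0} g(t)\,dt.$$

For any integer $n \geq 2$,

$$\frac{1}{z} \int_{I_{n,z}} g(t)\,dt \leq \int_{n-1+z}^{n+z} g(t)\,dt,$$

so

$$\frac{1}{z} \sum_{n \geq 2} \int_{I_{n,z}} g(t)\,dt \leq \int_{1+z}^{+\infty} g(t)\,dt.$$

Moreover, $\int_{I_{0,z}} g \leq zM$ and $\int_{I_{1,z}} g \leq zM$, so that these two
terms contribute at most $2M$ after division by z. Hence,

$$\frac{1}{z} \sum_{n \in \mathbb{Z}} \int_{I_{n,z}} g \leq \int_{-\infty}^{+\infty} g + 2M.$$

We prove in the same fashion that

$$\frac{1}{z} \sum_{n \in \mathbb{Z}} \int_{I_{n,z}} g \geq \int_{-\infty}^{+\infty} g - 2M.$$

Since $\sum_{n \in \mathbb{Z}} \int_{I_{n,z}} g = P(\{Y\} < z)$, $z < 1$ and $\int_{-\infty}^{+\infty} g = 1$, the result is
proved.

Now, applying this to Y = log(X) proves theorem 1.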
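The last step uses a change of variable that the text leaves implicit. A short derivation, writing g for the density of Y = log(X) (base 10 logarithm, as in the rest of the text):

```latex
g(y) = \frac{d}{dy} P\left(X \leq 10^{y}\right)
     = \ln(10)\, 10^{y} f\left(10^{y}\right)
     = \ln(10)\, (\mathrm{Id}.f)\left(10^{y}\right),
\qquad\text{hence}\qquad
M = \sup g = \ln(10)\, \max(\mathrm{Id}.f) = \ln(10)\, m .
```

Since $y \mapsto 10^{y}$ is increasing, g is nondecreasing up to $b = \log(a)$ and nonincreasing afterwards, so the bound 2M becomes 2 ln(10) m.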

Remark 1. The convergence theorem is still valid if we allow f to have a
finite number of changes of monotonicity, provided this number does not exceed a
previously fixed k. The proof is straightforward.

Remark 2. The assumptions made on Id.f may be seen as a measure of
scatter and regularity for X, adjusted for our purpose.

3. Examples

3.1. Type I Pareto 

A continuous r.v. X is type I Pareto with parameters $\alpha$ and $x_0$ ($\alpha, x_0 > 0$)
iff it admits a density function

$$f_{x_0,\alpha}(x) = \frac{\alpha x_0^{\alpha}}{x^{\alpha+1}} \mathbf{1}_{[x_0, +\infty[}(x).$$

Besides its classical use in income and wealth modelling, type I Pareto
variables arise in hydrology and astronomy (Paolella, 2006, page 252).
The function Id.f = g : $x \mapsto \frac{\alpha x_0^{\alpha}}{x^{\alpha}} \mathbf{1}_{[x_0, +\infty[}(x)$ is decreasing on
$[x_0, +\infty[$. Its maximum is

$$\sup(\mathrm{Id}.f) = \mathrm{Id}.f(x_0) = \alpha.$$

Therefore, X is nearly Benford-like, to the extent that

$$|P(\{\log X\} < z) - z| < 2 \ln(10)\, \alpha.$$
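For the type I Pareto case the probability P({log X} < z) has a closed form, which makes the bound easy to check numerically. The sketch below assumes x0 = 1 (our simplification), so that log X is exponential with rate α ln(10). The function name is illustrative.

```python
import math

def pareto1_frac_log_cdf(z, alpha):
    """P({log10 X} < z) for a type I Pareto X with x0 = 1 and shape alpha.

    With x0 = 1, log10 X is exponential with rate r = alpha * ln(10), and
    summing P(n <= log10 X < n + z) over n >= 0 gives a geometric series.
    """
    r = alpha * math.log(10)
    return (1 - math.exp(-r * z)) / (1 - math.exp(-r))

def worst_deviation(alpha, steps=100):
    """Largest |P({log10 X} < z) - z| over a grid of z values in ]0, 1]."""
    return max(abs(pareto1_frac_log_cdf(k / steps, alpha) - k / steps)
               for k in range(1, steps + 1))

for alpha in (0.05, 0.1, 0.5):
    print(alpha, worst_deviation(alpha), 2 * math.log(10) * alpha)
```

The bound is only informative for small α, since it exceeds 1 as soon as α is larger than about 0.22.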

3.2. Type II Pareto 

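A r.v. X is type II Pareto with parameter $b > 0$ iff it admits a density
function defined by

$$f_b(x) = \frac{b}{(1 + x)^{b+1}} \mathbf{1}_{]0, +\infty[}(x).$$

It arises in a so-called mixture model, with mixing components being
sequences of gamma distributed r.v.'s.

The function $\mathrm{Id}.f_b = g_b : x \mapsto \frac{bx}{(1 + x)^{b+1}} \mathbf{1}_{]0, +\infty[}(x)$ is $C^{\infty}$ on $]0, +\infty[$, with derivative

$$g_b'(x) = \frac{b(1 - bx)}{(x + 1)^{b+2}},$$

which is positive whenever $x < \frac{1}{b}$, then negative. From this result we derive

$$\sup g_b = g_b\left(\frac{1}{b}\right) = \frac{1}{\left(1 + \frac{1}{b}\right)^{1+b}} = \left(\frac{b}{b + 1}\right)^{1+b}.$$

Since

$$\ln\left[\left(\frac{b}{b + 1}\right)^{b+1}\right] = (b + 1)\left[\ln b - \ln(b + 1)\right],$$

which tends toward $-\infty$ when b tends toward 0,

$$\sup g_b \xrightarrow[b \to 0]{} 0.$$

Theorem 1 applies. It follows that X tends toward Benford's law when
$b \to 0$.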
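The maximum of Id.f_b can be checked numerically. The sketch below (function names are ours) evaluates x f_b(x) = b x / (1 + x)^(b+1) at x = 1/b, compares it with the closed form (b/(b+1))^(b+1), and prints the resulting bound 2 ln(10) sup g_b for decreasing b.

```python
import math

def g_b(x, b):
    """Id.f_b(x) = b x / (1 + x)**(b + 1) for the type II Pareto density."""
    return b * x / (1 + x) ** (b + 1)

def sup_g_b(b):
    """Closed-form maximum of g_b, attained at x = 1/b."""
    return (b / (b + 1)) ** (b + 1)

for b in (1.0, 0.1, 0.01, 0.001):
    # Third column: the bound of theorem 1 for this value of b.
    print(b, sup_g_b(b), 2 * math.log(10) * sup_g_b(b))
```

The bound shrinks roughly like b as b tends to 0.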
3.3. Lognormal distributions

A r.v. X is lognormal iff $\log(X) \sim \mathcal{N}(\mu, \sigma^2)$. Lognormal distributions
have been related to Benford (?). It is easy to prove that whenever $\sigma \to \infty$,
X tends toward Benford's law. Although the proof may use different tools,
a straightforward way to do it is to apply theorem 1.

One classical explanation of Benford's law is that many data sets are
actually built through multiplicative processes (Pietronero et al., 2001). Thus,
data may be seen as a product of many small effects. This may be modelled
by a r.v. X that may be written as

$$X = \prod_i Y_i,$$

$Y_i$ being a sequence of random variables. Using the log transformation, this
leads to $\log(X) = \sum_i \log(Y_i)$.

The multiplicative central-limit theorem therefore proves that, under usual
assumptions, X is bound to be almost lognormal, with $\log(X) \sim \mathcal{N}(\mu, \sigma^2)$
and $\sigma \to \infty$, thus roughly conforming to Benford, as an application of
theorem 1.
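For a lognormal X the quantity P({log X} < z) can be computed from the normal c.d.f., so the effect of a growing σ is easy to observe. The sketch below is an illustration under our own choices (μ = 0.3, a truncation at 12 standard deviations). It compares the worst deviation from uniformity with 2M, where M = 1/(σ√(2π)) is the maximum of the normal density of log X.

```python
import math

def norm_cdf(x):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def frac_cdf_normal(z, mu, sigma, width=12):
    """P({Y} < z) for Y ~ N(mu, sigma^2), summing over integers near mu.

    Terms further than `width` standard deviations from mu are dropped.
    """
    lo = math.floor(mu - width * sigma)
    hi = math.ceil(mu + width * sigma)
    return sum(norm_cdf((n + z - mu) / sigma) - norm_cdf((n - mu) / sigma)
               for n in range(lo, hi + 1))

def worst_deviation(mu, sigma, steps=50):
    """Largest |P({Y} < z) - z| over a grid of z values in ]0, 1]."""
    return max(abs(frac_cdf_normal(k / steps, mu, sigma) - k / steps)
               for k in range(1, steps + 1))

for sigma in (0.2, 0.5, 1.0, 2.0):
    print(sigma, worst_deviation(0.3, sigma), 2 / (sigma * math.sqrt(2 * math.pi)))
```

In this setting the actual deviation falls much faster than the bound, which is only informative once σ is of order 1 or more.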



4. Generalizing Benford 

If we are right to think that Benford's law is to be understood as a
consequence of mere scatter and regularity, instead of special characteristics linked
with multiplicative, scale-invariance, or other log-related properties, we
should be able to state, prove, and check on real data sets, a generalized
version of Benford's law where some function u replaces the log.

Indeed, our basic idea is that X being scattered and regular enough im- 
plies log (X) to be scattered and regular as well, so that log (X) should be 
almost uniform mod 1. The same should be true of any u (X) , u being a 
function preserving scatter and regularity. Actually, some u should even be 
better shots than log, since log reduces scatter on [1, +oo[. 

First, let us set out a generalized version of theorem 1, the proof of which 
is closely similar to that of theorem 1. 

Theorem 2. Let X be a r.v. taking values in a real interval I, with p.d.f.
f. Let u be a $C^1$ increasing function $I \to \mathbb{R}$, such that $\frac{f}{u'} : x \mapsto \frac{f(x)}{u'(x)}$
conforms to the following: $\exists a > 0$ such that (1) $\max \frac{f}{u'} = m = \frac{f}{u'}(a)$ and
(2) $\frac{f}{u'}$ is non-decreasing on $]0, a]$, and non-increasing on $[a, +\infty[ \cap I$. Then,
for all $z \in [0, 1[$,

$$|P(\{u(X)\} < z) - z| < 2m.$$

In particular, if $(X_n)$ is a sequence of such r.v.'s with p.d.f. $f_n$ and
$\max(f_n/u') = m_n$, and $\lim_{n \to +\infty} m_n = 0$, then $\{u(X_n)\}$ converges in law
toward $U([0, 1[)$ when $n \to \infty$.

A r.v. X such that $\{u(X)\} \sim U([0, 1[)$ will be said to be u-Benford henceforth.

4.1. Sequences

Although our two theorems only apply to continuous r.v.'s, the underlying
intuition that log-Benford's law is only a special case (having, however, a
special interest thanks to its interpretation in terms of leading digits)
of a more general law does also apply to sequences. In this section,
we experimentally test u-Benfordness for a few sequences $(v_n)$ and four
functions u.

We will use six mathematical sequences. Three of them, namely $(\pi n)_{n \in \mathbb{N}}$,
prime numbers $(p_n)$, and $(\sqrt{n})_{n \in \mathbb{N}}$ are known not to follow Benford. The three
others, $(n^n)_{n \in \mathbb{N}}$, $(n!)_{n \in \mathbb{N}}$ and $(e^n)_{n \in \mathbb{N}}$ conform to Benford.

As for u, we will focus on four cases:

$x \mapsto \log[\log(x)]$
$x \mapsto \log(x)$
$x \mapsto \sqrt{x}$
$x \mapsto \pi x^2$

The first one increases very slowly, so we may expect that it will not work
perfectly. The second leads to the classical Benford's law. The $\pi$ coefficient
of the last u allows us to use integer numbers, for which $\{x^2\}$ is nil.

The result of the experiment is given in Table 1. 





Sequence              log∘log(v_n)    log(v_n)        √(v_n)          π·v_n²
√n   (N = 10 000)     68.90 (.000)    45.90 (.000)     4.94 (.000)    0.02 (.000)
πn   (N = 10 000)     44.08 (.000)    26.05 (.000)     0.19 (1.000)   0.80 (.544)
p_n  (N = 10 000)     53.92 (.000)    22.01 (.000)     0.44 (.990)    0.69 (.719)
e^n  (N = 1 000)       6.91 (.000)     0.76 (1.000)    0.63 (.815)    0.79 (.560)
n!   (N = 1 000)       7.39 (.000)     0.58 (.887)     0.61 (.844)    0.90 (.387)
n^n  (N = 1 000)       7.45 (.000)     0.80 (.543)    16.32 (.000)    0.74 (.646)

Table 1: Results of the Kolmogorov-Smirnov tests applied on {u(v_n)},
with four different functions u (columns) and six sequences (rows). Each
sequence is tested through its first N terms (from n = 1 to n = N), with an
exception for log∘log(n^n) and log∘log(n!), for which n = 1 is not
considered. Each cell displays the Kolmogorov-Smirnov z and the
corresponding p value.
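The statistic reported in each cell can be sketched as follows, assuming that the Kolmogorov-Smirnov z is √N times the largest gap between the empirical c.d.f. of {u(v_n)} and the uniform c.d.f. This is the usual convention, but it is our reading rather than something the text states. Two cells with u = log are computed; e^n is handled through n·log(e) to avoid overflow.

```python
import math

def ks_z(values):
    """Kolmogorov-Smirnov z = sqrt(N) * D for the fractional parts of
    `values` against the uniform distribution on [0, 1)."""
    fracs = sorted(v - math.floor(v) for v in values)
    n = len(fracs)
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(fracs))
    return math.sqrt(n) * d

# u = log10 applied to v_n = sqrt(n), first 10 000 terms.
z_log_sqrt = ks_z(0.5 * math.log10(n) for n in range(1, 10_001))
# u = log10 applied to v_n = e^n, first 1 000 terms, as n * log10(e).
z_log_exp = ks_z(n * math.log10(math.e) for n in range(1, 1_001))
print(round(z_log_sqrt, 2), round(z_log_exp, 2))
```

A large z, as for √n, signals a clear departure from uniformity, while a z below roughly 1.36 is not significant at the 5% level.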

The sequences have been arranged according to the speed with which they
diverge to $+\infty$ (and so are the functions u). None of the six sequences is
log∘log-Benford (but a faster divergent sequence such as $(10^{e^n})$ would do).
Only the last three are log-Benford. These are the sequences going to $\infty$
faster than any polynomial. Only one sequence, $(n^n)$, does not satisfy
√-Benfordness. However, this can be understood as a pathological case, since
$\sqrt{n^n}$ is an integer whenever n is even, or is a perfect square. Doing the same
Kolmogorov-Smirnov test with odd numbers not being perfect squares gives
z = 0.45 and p = 0.987, showing no discrepancy with √-Benfordness for
$(n^n)$. All six sequences are $\pi x^2$-Benford.

Putting aside the case of $\sqrt{n^n}$, what Table 1 reveals is that the
growth speed of $u(v_n)$ completely determines the u-Benfordness of $(v_n)$. More
precisely, it seems that $(v_n)$ is u-Benford whenever $u(v_n)$ increases as fast
as $\sqrt{n}$, and is not u-Benford whenever $u(v_n)$ increases as slowly as $\ln(n)$.
Of course, this rule of thumb is not to be taken as a theorem. Obviously
enough, one can actually decide to increase or decrease the growth speed
of $u(v_n)$ without changing $\{u(v_n)\}$, adding or subtracting ad hoc integer
numbers.

Nevertheless, this observation suggests that we give a closer look at
sequences f(n), where f is an increasing and concave real function diverging
toward $\infty$, and look for a condition for $(\{f(n)\})_n$ to converge to
uniformity. An intuitive idea is that $(\{f(n)\})_n$ will depart from uniformity if it
does not increase fast enough: we may define brackets of integers, namely
$[f^{-1}(n), f^{-1}(n + 1) - 1[ \cap \mathbb{N}$, within which $\lfloor f(n) \rfloor$ is constant, and of course
$\{f(n)\}$ increasing. If these brackets are "too large", the relative weight of the
last considered bracket is so important that it overcomes the first terms of
the sequence f(0), ..., f(n) mod 1. In that case, there is no limit to the
probability distribution of $(\{f(n)\})$. The weight of the brackets should therefore
be small relative to $f^{-1}(n)$, which may be written as

$$\frac{f^{-1}(n + 1) - f^{-1}(n)}{f^{-1}(n)} \xrightarrow[n \to \infty]{} 0.$$

Provided that f is regular, this leads to

$$\frac{\left(f^{-1}\right)'(x)}{f^{-1}(x)} \xrightarrow[x \to \infty]{} 0$$

or

$$\left[\ln\left(f^{-1}(x)\right)\right]' \xrightarrow[x \to \infty]{} 0.$$

Functions $f : x \mapsto x^{\alpha}$, $\alpha > 0$ satisfy this condition. Any $n^{\alpha}$ should then
show a uniform limit probability law, except for pathological cases ($\alpha \in \mathbb{Q}$).
Taking a = - gives (with N = 1000), a Kolmogorov-Smirnov z = 1,331, 
and a p-value 0.058, which means there is no significant discrepancy from 
uniformity. On the other hand, the log function which does not conform to 
this condition is such that {log(n)} is not uniform, confirming once again 
our rule-of-thumb conjecture. 
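As a worked instance of the condition that the derivative of ln(f^{-1}(x)) tends to 0 (a sketch, taking log to be the base 10 logarithm as elsewhere in the text), compare a power function with the logarithm:

```latex
f(x) = x^{\alpha}\ (0 < \alpha < 1):\quad
f^{-1}(x) = x^{1/\alpha},\quad
\left[\ln f^{-1}(x)\right]' = \frac{1}{\alpha x} \xrightarrow[x \to \infty]{} 0 ;
\\[4pt]
f(x) = \log(x):\quad
f^{-1}(x) = 10^{x},\quad
\left[\ln f^{-1}(x)\right]' = \ln(10) \not\to 0 .
```

The power function passes the test while the logarithm fails it, in line with the behaviour of {log(n)} noted above.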

4.2. Real data

We test three data sets for u-Benfordness using a Kolmogorov-Smirnov
test for uniformity. The first data set is the opening value of the Dow Jones
on the first day of each month from October 1928 to November 2007. The
second and third are country areas, expressed in millions of km², and
the populations of the different countries, as estimated in 2008, expressed in
millions of inhabitants. The last two data sets are provided by the CIA
World Factbook. Table 2 displays the results.





Data set                  log∘log(v_n)   log(v_n)       √(v_n)         π·v_n²
Dow Jones (N = 950)       5.90 (.000)    5.20 (.000)    0.75 (.635)    0.44 (.992)
Country areas (N = 256)   1.94 (.001)    0.51 (.959)    0.89 (.404)    1.88 (.002)
Populations (N = 242)     3.39 (.000)    0.79 (.568)    0.83 (.494)    0.42 (.994)

Table 2: Results of the Kolmogorov-Smirnov tests applied on {u(v_n)}.



This table confirms our analysis: classical Benfordness is actually less
often borne out than √-Benfordness on these data. The last column shows
that our previously conjectured rule has exceptions: divergence speed is not an
absolute criterion by itself. For country areas, the fast-growing $u : x \mapsto \pi x^2$
gives a discrepancy from uniformity, whereas the slow-growing log does not.
However, allowing for exceptions, it is still a good rule of thumb.



http://www.cia.gov/library/publications/the-world-factbook/docs/rankorderguide.html 






4.3. Continuous r.v.'s

Our theorems apply to continuous r.v.'s. We now focus on three examples
of such r.v.'s, with the same u as above (except for log∘log, which is not
defined everywhere): the uniform density on ]0, k] (k > 0), the exponential
density, and the absolute value of a normal distribution.

4.3.1. Uniform r.v.'s

It is a known fact that a uniform distribution on ]0, k] (k > 0) does not
approach classical Benfordness, even as a limit. On every bracket
$[10^{j-1}, 10^j - 1[$, the leading digit is uniform. Therefore, taking $k = 10^j - 1$ leads to
a uniform (and not logarithmic) distribution for leading digits, whatever j
might be.

The density $g_k$ of $\sqrt{X_k}$ is

$$g_k(x) = \frac{2x}{k}, \quad x \in \left]0, \sqrt{k}\right],$$

and $g_k(x) = 0$ otherwise. It is a non-decreasing function on $]-\infty, \sqrt{k}]$,
non-increasing on $[\sqrt{k}, +\infty[$, with maximum $\frac{2}{\sqrt{k}} \to 0$ when $k \to \infty$. Theorem 2
applies, showing that $X_k$ tends toward √-Benfordness in law. Now, theorem
3 below proves that $X_k$ tends toward u-Benfordness when $u(x) = \pi x^2$.
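The convergence toward √-Benfordness can be checked exactly, since the c.d.f. of √X for X uniform on ]0, k] is y²/k on [0, √k]. The sketch below (names and grid are ours) compares the worst deviation from uniformity with 2m = 4/√k, where m = 2/√k is the maximum of the density of √X.

```python
import math

def frac_cdf_sqrt_uniform(z, k):
    """P({sqrt(X)} < z) for X uniform on ]0, k], using F(y) = min(y^2/k, 1)."""
    def F(y):
        return min(y * y / k, 1.0)
    top = math.isqrt(int(k)) + 1  # every integer n below sqrt(k) is covered
    return sum(F(n + z) - F(n) for n in range(top + 1))

def worst_deviation(k, steps=20):
    """Largest |P({sqrt(X)} < z) - z| over a grid of z values in ]0, 1[."""
    return max(abs(frac_cdf_sqrt_uniform(j / steps, k) - j / steps)
               for j in range(1, steps))

for k in (100, 10_000, 1_000_000):
    print(k, worst_deviation(k), 4 / math.sqrt(k))
```

When k is a perfect square the deviation is z(1 − z)/√k, so it decays at the same rate as the bound.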

Theorem 3. If X follows a uniform density on ]0, k], $\{\pi X^2\}$ converges in
law toward uniformity on [0, 1[ when $k \to \infty$.

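Proof. Let $X \sim U(]0, k])$. The p.d.f. g of $Y = \pi X^2$ is

$$g(x) = \frac{1}{2a\sqrt{x}} \mathbf{1}_{]0, a^2]}(x),$$

where $a = k\sqrt{\pi}$.

The c.d.f. G of Y is then

$$G(x) = \frac{\sqrt{x}}{a}, \quad x \in ]0, a^2].$$

Let now $\delta \in ]0, 1[$. Call $P_\delta$ the probability that $\{Y\} < \delta$.

$$\sum_{j=0}^{\lfloor a^2 - \delta \rfloor} \left[G(j + \delta) - G(j)\right] \leq P_\delta \leq \sum_{j=0}^{\lfloor a^2 - \delta \rfloor + 1} \left[G(j + \delta) - G(j)\right],$$

that is,

$$\frac{1}{a} \sum_{j=0}^{\lfloor a^2 - \delta \rfloor} \left[\sqrt{j + \delta} - \sqrt{j}\right] \leq P_\delta \leq \frac{1}{a} \sum_{j=0}^{\lfloor a^2 - \delta \rfloor + 1} \left[\sqrt{j + \delta} - \sqrt{j}\right].$$

The square-root function being concave,

$$\sqrt{j + \delta} - \sqrt{j} \geq \frac{\delta}{2\sqrt{j + \delta}}$$

and, for any $j > 0$,

$$\sqrt{j + \delta} - \sqrt{j} \leq \frac{\delta}{2\sqrt{j}}.$$

Hence,

$$\frac{\delta}{2a} \sum_{j=0}^{\lfloor a^2 - \delta \rfloor} \frac{1}{\sqrt{j + \delta}} \leq P_\delta \leq \frac{\sqrt{\delta}}{a} + \frac{\delta}{2a} \sum_{j=1}^{\lfloor a^2 - \delta \rfloor + 1} \frac{1}{\sqrt{j}}.$$

$x \mapsto \frac{1}{\sqrt{x}}$ being decreasing,

$$\sum_{j=0}^{\lfloor a^2 - \delta \rfloor} \frac{1}{\sqrt{j + \delta}} \geq \int_{\delta}^{\lfloor a^2 - \delta \rfloor + 1 + \delta} \frac{dt}{\sqrt{t}} = 2\left[\sqrt{\lfloor a^2 - \delta \rfloor + 1 + \delta} - \sqrt{\delta}\right]$$

and

$$\sum_{j=1}^{\lfloor a^2 - \delta \rfloor + 1} \frac{1}{\sqrt{j}} \leq \int_{0}^{\lfloor a^2 - \delta \rfloor + 1} \frac{dt}{\sqrt{t}} = 2\sqrt{\lfloor a^2 - \delta \rfloor + 1}.$$

So,

$$\frac{\delta}{a}\left[\sqrt{\lfloor a^2 - \delta \rfloor + 1 + \delta} - \sqrt{\delta}\right] \leq P_\delta \leq \frac{\sqrt{\delta}}{a} + \frac{\delta}{a}\sqrt{\lfloor a^2 - \delta \rfloor + 1}.$$

As a consequence, for any fixed $\delta$, $\lim_{a \to \infty} P_\delta = \delta$, and $\{\pi X^2\}$ converges
in law to uniformity on [0, 1[.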
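The squeeze used in this proof can be observed numerically. The sketch below computes P_δ exactly from G(x) = √x / a (clipped at 1, with a = k√π) and compares it with the two bounds (δ/a)[√(J + 1 + δ) − √δ] and √δ/a + (δ/a)√(J + 1), where J = ⌊a² − δ⌋. These bounds are our reading of the final inequality of the proof, and the function names are illustrative.

```python
import math

def p_delta_uniform(delta, k):
    """Exact P({pi X^2} < delta) for X uniform on ]0, k].

    Uses G(x) = min(sqrt(x)/a, 1) with a = k*sqrt(pi) and sums
    G(j + delta) - G(j) over all integers j up to a^2.
    """
    a = k * math.sqrt(math.pi)
    def G(x):
        return min(math.sqrt(x) / a, 1.0)
    return sum(G(j + delta) - G(j) for j in range(math.floor(a * a) + 1))

def bounds(delta, k):
    """Lower and upper bounds on P_delta, as read from the end of the proof."""
    a = k * math.sqrt(math.pi)
    J = math.floor(a * a - delta)
    lower = delta / a * (math.sqrt(J + 1 + delta) - math.sqrt(delta))
    upper = math.sqrt(delta) / a + delta / a * math.sqrt(J + 1)
    return lower, upper

for k in (5, 20, 80):
    lo, hi = bounds(0.3, k)
    print(k, lo, p_delta_uniform(0.3, k), hi)
```

Both bounds approach δ at rate 1/a, so the gap closes as k grows.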
4.3.2. Exponential r.v.'s

Let $X_\lambda$ be an exponential r.v. with p.d.f. $f_\lambda(x) = \lambda \exp(-\lambda x)$ ($x \geq 0$, $\lambda > 0$).
Engel and Leuenberger (2003) demonstrated that $X_\lambda$ tends toward
Benford's law when $\lambda \to 0$.

The p.d.f. of $\sqrt{X_\lambda}$ is $x \mapsto 2\lambda x \exp(-\lambda x^2)$, which increases on $\left]0, \frac{1}{\sqrt{2\lambda}}\right]$
and then decreases. Its maximum is $\sqrt{2\lambda} \exp\left(-\frac{1}{2}\right)$. Theorem 2 thus applies,
showing that $X_\lambda$ is √-Benford as a limit when $\lambda \to 0$.

Finally, theorem 4 below demonstrates that $X_\lambda$ tends toward u-Benfordness
for $u(x) = \pi x^2$ as well.

Theorem 4. If $X \sim \mathrm{EXP}(\lambda)$ (with p.d.f. $f : x \mapsto \lambda \exp(-\lambda x)$), then
$Y = \pi X^2$ converges toward uniformity mod 1 when $\lambda \to 0$.

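Proof. Let X be such a r.v. $Y = \pi X^2$ has density g with

$$g(x) = \frac{\mu}{2\sqrt{x}} \exp(-\mu\sqrt{x}), \quad x > 0,$$

where $\mu = \frac{\lambda}{\sqrt{\pi}}$. The c.d.f. G of Y is thus, for all $x > 0$,

$$G(x) = 1 - e^{-\mu\sqrt{x}}.$$

Let $P_\delta$ denote the probability that $\{Y\} < \delta$, for $\delta \in ]0, 1[$.

$$P_\delta = \sum_{j=0}^{\infty} \left[G(j + \delta) - G(j)\right] = \sum_{j=0}^{\infty} \left[e^{-\mu\sqrt{j}} - e^{-\mu\sqrt{j + \delta}}\right].$$

$x \mapsto \exp(-\mu\sqrt{x})$ being convex,

$$e^{-\mu\sqrt{j}} - e^{-\mu\sqrt{j + \delta}} \leq \delta\, \frac{\mu}{2\sqrt{j}}\, e^{-\mu\sqrt{j}}$$

for any $j > 0$, and

$$e^{-\mu\sqrt{j}} - e^{-\mu\sqrt{j + \delta}} \geq \delta\, \frac{\mu}{2\sqrt{j + \delta}}\, e^{-\mu\sqrt{j + \delta}}$$

for any $j \geq 0$. Thus

$$\delta \sum_{j=0}^{\infty} \frac{\mu}{2\sqrt{j + \delta}}\, e^{-\mu\sqrt{j + \delta}} \leq P_\delta \leq 1 - e^{-\mu\sqrt{\delta}} + \delta \sum_{j=1}^{\infty} \frac{\mu}{2\sqrt{j}}\, e^{-\mu\sqrt{j}}.$$

$x \mapsto \frac{\mu}{2\sqrt{x}} \exp(-\mu\sqrt{x})$ being decreasing,

$$\sum_{j=0}^{\infty} \frac{\mu}{2\sqrt{j + \delta}}\, e^{-\mu\sqrt{j + \delta}} \geq \int_{\delta}^{\infty} \frac{\mu}{2\sqrt{t}}\, e^{-\mu\sqrt{t}}\,dt = e^{-\mu\sqrt{\delta}}$$

and

$$\sum_{j=1}^{\infty} \frac{\mu}{2\sqrt{j}}\, e^{-\mu\sqrt{j}} \leq \int_{0}^{\infty} \frac{\mu}{2\sqrt{t}}\, e^{-\mu\sqrt{t}}\,dt = 1.$$

Therefore $\delta e^{-\mu\sqrt{\delta}} \leq P_\delta \leq 1 - e^{-\mu\sqrt{\delta}} + \delta$.
The two expressions tend toward $\delta$ when $\mu \to 0$, so that $P_\delta \to \delta$. The
proof is complete.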
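A numerical check of this limit, under our own reading of the proof: with μ = λ/√π and G(x) = 1 − exp(−μ√x), convexity of x ↦ exp(−μ√x) and a comparison with integrals give δ·exp(−μ√δ) ≤ P_δ ≤ 1 − exp(−μ√δ) + δ. The sketch below truncates the infinite sum where the tail is negligible in double precision. The function name and the truncation rule are assumptions of this illustration.

```python
import math

def p_delta_exponential(delta, lam, tail=40.0):
    """P({pi X^2} < delta) for X ~ EXP(lam), via G(x) = 1 - exp(-mu sqrt(x)).

    mu = lam / sqrt(pi). The infinite sum is truncated once
    exp(-mu * sqrt(j)) < exp(-tail), which is negligible in double precision.
    """
    mu = lam / math.sqrt(math.pi)
    j_max = int((tail / mu) ** 2) + 1
    return sum(math.exp(-mu * math.sqrt(j)) - math.exp(-mu * math.sqrt(j + delta))
               for j in range(j_max + 1))

def squeeze(delta, lam):
    """Lower and upper bounds on P_delta described in the lead-in."""
    mu = lam / math.sqrt(math.pi)
    e = math.exp(-mu * math.sqrt(delta))
    return delta * e, 1 - e + delta

for lam in (2.0, 0.5, 0.1):
    lo, hi = squeeze(0.3, lam)
    print(lam, lo, p_delta_exponential(0.3, lam), hi)
```

As λ decreases, both bounds close in on δ = 0.3.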



4.3.3. Absolute value of a normal distribution

To test the absolute value of a normal distribution X with mean 0 and
variance $10^8$, we picked a sample of 2000 values and used the same procedure
as for real data. It appears, as shown in Table 3, that X significantly departs
from u-Benfordness with u = log and $u = \pi x^2$, but not with $u = \sqrt{x}$.





                        log(X)          √X              πX²
U([0, k[), k → ∞        NO              YES             YES
EXP(λ), λ → 0           YES             YES             YES
|N(0, 10^8)|            14.49 (.000)    0.647 (.797)    28.726 (.000)

Table 3: The table displays whether uniform distributions, exponential
distributions, and the absolute value of a normal distribution are u-Benford for
different functions u, or not. The last row shows the results (and p-values)
of the Kolmogorov-Smirnov tests applied to a sample of 2000 values. It could
be read as "NO; YES; NO".

As we already noticed, the best shot when one is looking for Benford 
seems to be the square-root rather than log . 
5. Discussion

Random variables conforming exactly to the classical Benford's law are rare,
although many do roughly approach the law. Indeed, many explanations have
been proposed for this approximate law to hold so often. These explanations
involve complex characteristics, sometimes directly related to logarithms,
sometimes through multiplicative properties.

Our idea, formalized in theorem 1, is simpler and more general. The
fact that real data often are regular and scattered is intuitive. What we
proved is an idea which has been recently expressed by Fewster (2009): scatter
and regularity are actually sufficient conditions for Benfordness.

This fact thus provides a new explanation of Benford's law. Other expla- 
nations, of course, are acceptable as well. But it may be argued that some 
of the most popular explanations are in fact corollaries of our theorem. As 
we have seen when studying Pareto type II density, mixtures of distributions 
may lead to regular and scattered density, to which theorem 1 applies. Thus, 
we may argue that a mixture of densities is nearly Benford because it is nec- 
essarily scattered and regular. In the same fashion, multiplications of effects 
lead to Benford-like densities, but also (as the multiplicative central-limit 
theorem states) to regular and scattered densities. 

Apart from the fact that our explanation is simpler and (arguably) more
general, a good argument in its favor is that Benfordness may be generalized,
unlike log-related explanations. Scale invariance or multiplicative
properties are log-related. But as we have seen, Benfordness is not dependent on
log, and can easily be generalized. Actually, it seems that square root is a
better candidate than log. The historical importance of log-Benfordness is
of course due to its implications in terms of leading digits, which have no
equivalent for the square root.